Overview

Dataset statistics

Number of variables: 19
Number of observations: 306795
Missing cells: 288655
Missing cells (%): 5.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 44.5 MiB
Average record size in memory: 152.0 B
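
The percentage and memory figures above can be cross-checked from the raw counts. The sketch below rests on two assumptions that the report does not state: that the missing-cell percentage is taken over all cells (variables times observations), and that each cell occupies one 8-byte slot in memory. Both happen to reproduce the reported values.

```python
# Figures copied from the dataset statistics.
n_variables = 19
n_observations = 306_795
missing_cells = 288_655

# Assumption: the percentage is computed over all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: one 8-byte value (a number or an object pointer) per cell.
bytes_per_record = 8 * n_variables
total_mib = bytes_per_record * n_observations / 2**20

print(f"missing cells: {missing_pct:.1f}%")
print(f"record size:   {bytes_per_record} B")
print(f"total size:    {total_mib:.1f} MiB")
```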

Variable types

Numeric: 6
Text: 12
Categorical: 1

Alerts

age is highly overall correlated with age_range (High correlation)
age_range is highly overall correlated with age (High correlation)
publication_range is highly overall correlated with year_of_publication (High correlation)
year_of_publication is highly overall correlated with publication_range (High correlation)
language is highly imbalanced (96.4%) (Imbalance)
location_country has 13975 (4.6%) missing values (Missing)
location_state has 16318 (5.3%) missing values (Missing)
location_city has 18056 (5.9%) missing values (Missing)
category has 121221 (39.5%) missing values (Missing)
summary has 119084 (38.8%) missing values (Missing)

Reproduction

Analysis started: 2025-12-05 13:53:31.843478
Analysis finished: 2025-12-05 13:53:57.723447
Duration: 25.88 seconds
Software version: ydata-profiling v4.18.0
Download configuration: config.json

Variables

user_id
Real number (ℝ)

Distinct: 59803
Distinct (%): 19.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 136128.42
Minimum: 8
Maximum: 278854
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 8
5-th percentile: 11676
Q1: 67591
median: 134076
Q3: 206438
95-th percentile: 263107
Maximum: 278854
Range: 278846
Interquartile range (IQR): 138847

Descriptive statistics

Standard deviation: 80512.194
Coefficient of variation (CV): 0.59144297
Kurtosis: -1.2075485
Mean: 136128.42
Median Absolute Deviation (MAD): 69895
Skewness: 0.043939502
Sum: 4.1763517 × 10^10
Variance: 6.4822134 × 10^9
Monotonicity: Not monotonic
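
Several of the user_id statistics are simple functions of the others, which gives a quick consistency check. The sketch below recomputes them from the reported figures using the textbook definitions; small differences in the last digits are expected because the inputs are already rounded.

```python
import math

# Figures copied from the user_id statistics.
minimum, maximum = 8, 278_854
q1, q3 = 67_591, 206_438
mean, std = 136_128.42, 80_512.194
n_observations = 306_795

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance
total = mean * n_observations     # Sum

print(value_range, iqr)
print(f"{cv:.6f} {variance:.4e} {total:.4e}")
```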
Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
11676                 5520    1.8%
98391                 4560    1.5%
189835                1503    0.5%
153662                1496    0.5%
23902                 956     0.3%
235105                812     0.3%
76499                 810     0.3%
171118                771     0.3%
16795                 760     0.2%
248718                747     0.2%
Other values (59793)  288860  94.2%

Minimum 10 values

Value  Count  Frequency (%)
8      7      < 0.1%
9      1      < 0.1%
12     1      < 0.1%
14     2      < 0.1%
16     1      < 0.1%
17     2      < 0.1%
19     1      < 0.1%
22     1      < 0.1%
26     2      < 0.1%
32     1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
278854  3      < 0.1%
278852  1      < 0.1%
278851  12     < 0.1%
278849  1      < 0.1%
278846  1      < 0.1%
278844  1      < 0.1%
278843  13     < 0.1%
278832  2      < 0.1%
278831  1      < 0.1%
278828  1      < 0.1%

isbn
Text

Distinct: 129777
Distinct (%): 42.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 3067950
Distinct characters: 36
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 88392
Unique (%): 28.8%

Sample

1st row: 0002005018
2nd row: 0002005018
3rd row: 0002005018
4th row: 0002005018
5th row: 0002005018
Value                  Count   Frequency (%)
0316666343             566     0.2%
0971880107             465     0.2%
0385504209             390     0.1%
0312195516             307     0.1%
0060928336             256     0.1%
059035342x             251     0.1%
0142001740             246     0.1%
0446672211             236     0.1%
044023722x             225     0.1%
0452282152             223     0.1%
Other values (129767)  303630  99.0%
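
Every isbn value is 10 characters long and some end in "x", which matches the ISBN-10 format, where the last character is a check digit and "X" stands for 10. The report does not say whether the values were validated, so the following is only a sketch of the standard ISBN-10 checksum (weights 10 down to 1, sum divisible by 11) applied to values from the table; the function name is illustrative.

```python
def is_valid_isbn10(isbn: str) -> bool:
    """Return True if `isbn` passes the standard ISBN-10 checksum."""
    if len(isbn) != 10:
        return False
    total = 0
    for position, char in enumerate(isbn):
        if char in "0123456789":
            digit = int(char)
        elif char in "xX" and position == 9:
            digit = 10  # "X" means 10 and is only allowed as the check digit
        else:
            return False
        total += (10 - position) * digit  # weights run from 10 down to 1
    return total % 11 == 0


# Three values from the table, plus one with a deliberately altered last digit.
examples = ["0316666343", "059035342x", "044023722x", "0002005019"]
results = {isbn: is_valid_isbn10(isbn) for isbn in examples}
print(results)
```

A check like this could separate well-formed ISBNs from malformed ones before the column is used as a join key.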

Most occurring characters

Value              Count   Frequency (%)
0                  570794  18.6%
4                  322070  10.5%
1                  315757  10.3%
5                  306897  10.0%
3                  304329  9.9%
2                  262973  8.6%
7                  253403  8.3%
6                  253281  8.3%
8                  247669  8.1%
9                  205399  6.7%
Other values (26)  25378   0.8%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  3067950  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  3067950  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  3067950  100.0%

rating
Real number (ℝ)

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.0697143
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 6
median: 8
Q3: 9
95-th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.4332165
Coefficient of variation (CV): 0.34417466
Kurtosis: 0.22807537
Mean: 7.0697143
Median Absolute Deviation (MAD): 1
Skewness: -0.99433358
Sum: 2168953
Variance: 5.9205426
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Common values

Value  Count  Frequency (%)
8      73593  24.0%
7      52928  17.3%
9      48673  15.9%
10     42774  13.9%
6      25311  8.3%
5      14111  4.6%
1      13249  4.3%
2      12929  4.2%
4      12707  4.1%
3      10520  3.4%

Minimum 10 values

Value  Count  Frequency (%)
1      13249  4.3%
2      12929  4.2%
3      10520  3.4%
4      12707  4.1%
5      14111  4.6%
6      25311  8.3%
7      52928  17.3%
8      73593  24.0%
9      48673  15.9%
10     42774  13.9%

Maximum 10 values

Value  Count  Frequency (%)
10     42774  13.9%
9      48673  15.9%
8      73593  24.0%
7      52928  17.3%
6      25311  8.3%
5      14111  4.6%
4      12707  4.1%
3      10520  3.4%
2      12929  4.2%
1      13249  4.3%
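
Because rating takes only ten distinct values, its summary statistics can be recomputed directly from the value counts. The sketch below does this for the sum, mean, and quantiles; the quantile rule used here (the smallest value whose cumulative count reaches the requested share) is a simplifying assumption and not necessarily the interpolation the profiling tool applies, although it gives the same answers for these data.

```python
# Rating value counts copied from the frequency table.
counts = {1: 13249, 2: 12929, 3: 10520, 4: 12707, 5: 14111,
          6: 25311, 7: 52928, 8: 73593, 9: 48673, 10: 42774}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n


def quantile(q: float) -> int:
    """Smallest rating whose cumulative count reaches the share q of all rows."""
    running = 0
    for value in sorted(counts):
        running += counts[value]
        if running >= q * n:
            return value
    raise ValueError("q must be between 0 and 1")


quantiles = [quantile(q) for q in (0.05, 0.25, 0.50, 0.75, 0.95)]
print(n, total, round(mean, 7), quantiles)
```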
Distinct: 16831
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB

Length

Max length: 69
Median length: 56
Mean length: 25.095402
Min length: 3

Characters and Unicode

Total characters: 7699144
Distinct characters: 86
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 6574
Unique (%): 2.1%

Sample

1st row: timmins, ontario, canada
2nd row: toronto, ontario, canada
3rd row: kingston, ontario, canada
4th row: comber, ontario, canada
5th row: guelph, ontario, canada
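
The sample rows look like comma-separated "city, state, country" strings, and the first rows of location_city, location_state, and location_country elsewhere in this report carry the same pieces. The report does not show how those columns were derived, so the following is only a sketch of one plausible split. The function name, the padding of short strings, and the treatment of "n/a" as missing are all assumptions made for illustration.

```python
def split_location(location: str) -> dict:
    """Split a 'city, state, country' string into three named fields."""
    parts = [part.strip() for part in location.split(",")]
    # Pad short inputs so three fields always come back (an illustrative choice).
    parts += [""] * (3 - len(parts))
    city, state = parts[0], parts[1]
    # Any extra commas are kept inside the country field rather than dropped.
    country = ", ".join(parts[2:])

    def clean(text: str):
        # Assumption: empty strings and the literal "n/a" both mean "missing".
        return None if text in ("", "n/a") else text

    return {
        "location_city": clean(city),
        "location_state": clean(state),
        "location_country": clean(country),
    }


print(split_location("timmins, ontario, canada"))
```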
Value                 Count   Frequency (%)
usa                   209738  19.9%
new                   29157   2.8%
california            28921   2.7%
canada                28423   2.7%
n/a                   21398   2.0%
york                  13502   1.3%
ontario               12999   1.2%
texas                 12267   1.2%
united                11935   1.1%
kingdom               11833   1.1%
Other values (11194)  672944  63.9%

Most occurring characters

Value              Count    Frequency (%)
a                  956473   12.4%
(space)            746488   9.7%
,                  614860   8.0%
n                  576154   7.5%
s                  529986   6.9%
i                  490487   6.4%
o                  459933   6.0%
e                  443403   5.8%
r                  395241   5.1%
u                  350111   4.5%
Other values (76)  2136008  27.7%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  7699144  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  7699144  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  7699144  100.0%

age
Real number (ℝ)

High correlation 

Distinct: 91
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.348151
Minimum: 5
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 5
5-th percentile: 21
Q1: 29
median: 29
Q3: 40
95-th percentile: 56
Maximum: 99
Range: 94
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 10.847369
Coefficient of variation (CV): 0.31580648
Kurtosis: 1.2357685
Mean: 34.348151
Median Absolute Deviation (MAD): 4
Skewness: 1.0707524
Sum: 10537841
Variance: 117.66541
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count   Frequency (%)
29                 101747  33.2%
33                 8084    2.6%
28                 7724    2.5%
32                 7595    2.5%
52                 7583    2.5%
34                 7559    2.5%
31                 7443    2.4%
30                 7090    2.3%
26                 6660    2.2%
35                 6357    2.1%
Other values (81)  138953  45.3%

Minimum 10 values

Value  Count  Frequency (%)
5      82     < 0.1%
6      8      < 0.1%
7      54     < 0.1%
8      210    0.1%
9      269    0.1%
10     67     < 0.1%
11     147    < 0.1%
12     271    0.1%
13     441    0.1%
14     1171   0.4%

Maximum 10 values

Value  Count  Frequency (%)
99     3      < 0.1%
98     1      < 0.1%
97     39     < 0.1%
96     2      < 0.1%
94     1      < 0.1%
93     9      < 0.1%
92     1      < 0.1%
90     32     < 0.1%
89     1      < 0.1%
86     1      < 0.1%

location_country
Text

Missing 

Distinct: 208
Distinct (%): 0.1%
Missing: 13975
Missing (%): 4.6%
Memory size: 2.3 MiB

Length

Max length: 35
Median length: 3
Mean length: 4.3936104
Min length: 2

Characters and Unicode

Total characters: 1286537
Distinct characters: 28
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 60
Unique (%): < 0.1%

Sample

1st row: canada
2nd row: canada
3rd row: canada
4th row: canada
5th row: canada
Value               Count   Frequency (%)
usa                 209717  68.3%
canada              28406   9.3%
united              11927   3.9%
kingdom             11826   3.9%
germany             9732    3.2%
spain               5785    1.9%
australia           5603    1.8%
france              3708    1.2%
portugal            2766    0.9%
malaysia            1705    0.6%
Other values (232)  15857   5.2%

Most occurring characters

Value              Count   Frequency (%)
a                  354870  27.6%
u                  232363  18.1%
s                  228924  17.8%
n                  84710   6.6%
d                  58450   4.5%
i                  47115   3.7%
e                  37131   2.9%
c                  33698   2.6%
r                  29656   2.3%
t                  26495   2.1%
Other values (18)  153125  11.9%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  1286537  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  1286537  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  1286537  100.0%

location_state
Text

Missing 

Distinct: 1405
Distinct (%): 0.5%
Missing: 16318
Missing (%): 5.3%
Memory size: 2.3 MiB

Length

Max length: 48
Median length: 29
Mean length: 8.6969571
Min length: 1

Characters and Unicode

Total characters: 2526266
Distinct characters: 28
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 526
Unique (%): 0.2%

Sample

1st row: ontario
2nd row: ontario
3rd row: ontario
4th row: ontario
5th row: ontario
Value                Count   Frequency (%)
california           28910   8.5%
new                  23440   6.9%
ontario              12980   3.8%
texas                12261   3.6%
georgia              10617   3.1%
york                 10586   3.1%
florida              8928    2.6%
virginia             8458    2.5%
illinois             8409    2.5%
washington           8080    2.4%
Other values (1440)  209450  61.2%

Most occurring characters

Value              Count   Frequency (%)
a                  328794  13.0%
i                  286430  11.3%
n                  257543  10.2%
o                  220023  8.7%
r                  179147  7.1%
e                  175209  6.9%
s                  153991  6.1%
l                  130034  5.1%
t                  108242  4.3%
c                  88887   3.5%
Other values (18)  597966  23.7%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2526266  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2526266  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2526266  100.0%

location_city
Text

Missing 

Distinct: 11189
Distinct (%): 3.9%
Missing: 18056
Missing (%): 5.9%
Memory size: 2.3 MiB

Length

Max length: 42
Median length: 32
Mean length: 8.6337038
Min length: 1

Characters and Unicode

Total characters: 2492887
Distinct characters: 28
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 3796
Unique (%): 1.3%

Sample

1st row: timmins
2nd row: toronto
3rd row: kingston
4th row: comber
5th row: guelph
Value                 Count   Frequency (%)
san                   6943    1.9%
toronto               4825    1.3%
city                  4779    1.3%
morrow                4565    1.3%
st                    4302    1.2%
london                3249    0.9%
beach                 3057    0.8%
chicago               2727    0.8%
louis                 2640    0.7%
seattle               2468    0.7%
Other values (10011)  320385  89.0%

Most occurring characters

Value              Count   Frequency (%)
a                  242821  9.7%
e                  223862  9.0%
o                  215292  8.6%
n                  204761  8.2%
r                  177959  7.1%
l                  177124  7.1%
i                  151734  6.1%
t                  151004  6.1%
s                  143746  5.8%
c                  88824   3.6%
Other values (18)  715760  28.7%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2492887  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2492887  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2492887  100.0%

age_range
Real number (ℝ)

High correlation 

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.438436
Minimum: 0
Maximum: 90
Zeros: 623
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 20
Q1: 20
median: 20
Q3: 40
95-th percentile: 50
Maximum: 90
Range: 90
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 12.028887
Coefficient of variation (CV): 0.42297989
Kurtosis: 0.69494486
Mean: 28.438436
Median Absolute Deviation (MAD): 10
Skewness: 1.0224645
Sum: 8724770
Variance: 144.69411
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Common values

Value  Count   Frequency (%)
20     148211  48.3%
30     66874   21.8%
40     42891   14.0%
50     27018   8.8%
10     12101   3.9%
60     7275    2.4%
70     1476    0.5%
0      623     0.2%
80     238     0.1%
90     88      < 0.1%

Minimum 10 values

Value  Count   Frequency (%)
0      623     0.2%
10     12101   3.9%
20     148211  48.3%
30     66874   21.8%
40     42891   14.0%
50     27018   8.8%
60     7275    2.4%
70     1476    0.5%
80     238     0.1%
90     88      < 0.1%

Maximum 10 values

Value  Count   Frequency (%)
90     88      < 0.1%
80     238     0.1%
70     1476    0.5%
60     7275    2.4%
50     27018   8.8%
40     42891   14.0%
30     66874   21.8%
20     148211  48.3%
10     12101   3.9%
0      623     0.2%
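
The High correlation alert between age and age_range is what one would expect if age_range is derived from age. The report does not state the rule, so the sketch below tests one assumption: that age_range is age rounded down to a multiple of 10. Only the lowest and highest buckets can be checked here, because the extreme-value tables for age list every age that falls into them.

```python
from collections import Counter

# Counts copied from the extreme-value tables for age.
youngest = {5: 82, 6: 8, 7: 54, 8: 210, 9: 269}
oldest = {90: 32, 92: 1, 93: 9, 94: 1, 96: 2, 97: 39, 98: 1, 99: 3}


def decade(age: int) -> int:
    """Assumed binning rule: round the age down to a multiple of 10."""
    return (age // 10) * 10


binned = Counter()
for age, count in {**youngest, **oldest}.items():
    binned[decade(age)] += count

# Compare with the age_range counts for the values 0 and 90.
print(binned[0], binned[90])
```

Both totals agree with the age_range table (623 rows at 0 and 88 rows at 90), which is consistent with the assumed rule but does not prove it for the middle buckets.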
Distinct: 117729
Distinct (%): 38.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB

Length

Max length: 256
Median length: 194
Mean length: 33.361896
Min length: 1

Characters and Unicode

Total characters: 10235263
Distinct characters: 125
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 78563
Unique (%): 25.6%

Sample

1st row: Clara Callan
2nd row: Clara Callan
3rd row: Clara Callan
4th row: Clara Callan
5th row: Clara Callan
Value                 Count    Frequency (%)
the                   142812   8.4%
of                    69185    4.1%
a                     54731    3.2%
and                   32097    1.9%
(empty)               27409    1.6%
in                    19316    1.1%
to                    19075    1.1%
novel                 17390    1.0%
book                  15565    0.9%
for                   12977    0.8%
Other values (60810)  1295235  75.9%

Most occurring characters

Value               Count    Frequency (%)
(space)             1401284  13.7%
e                   980639   9.6%
o                   614142   6.0%
a                   563789   5.5%
r                   536671   5.2%
i                   526747   5.1%
n                   511925   5.0%
t                   475072   4.6%
s                   422464   4.1%
l                   329905   3.2%
Other values (115)  3872625  37.8%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  10235263  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  10235263  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  10235263  100.0%
Distinct: 54715
Distinct (%): 17.8%
Missing: 1
Missing (%): < 0.1%
Memory size: 2.3 MiB

Length

Max length: 122
Median length: 66
Mean length: 13.750155
Min length: 1

Characters and Unicode

Total characters: 4218465
Distinct characters: 104
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 32214
Unique (%): 10.5%

Sample

1st row: Richard Bruce Wright
2nd row: Richard Bruce Wright
3rd row: Richard Bruce Wright
4th row: Richard Bruce Wright
5th row: Richard Bruce Wright
Value                 Count   Frequency (%)
john                  9936    1.5%
james                 6038    0.9%
stephen               5582    0.8%
robert                5417    0.8%
michael               5191    0.8%
j                     5134    0.8%
david                 4519    0.7%
r                     4442    0.7%
anne                  4339    0.6%
king                  4134    0.6%
Other values (31373)  616860  91.9%

Most occurring characters

Value              Count    Frequency (%)
e                  372027   8.8%
(space)            365765   8.7%
a                  341756   8.1%
n                  285730   6.8%
r                  274488   6.5%
i                  230356   5.5%
o                  205156   4.9%
l                  196917   4.7%
t                  147587   3.5%
s                  136399   3.2%
Other values (94)  1662284  39.4%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  4218465  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  4218465  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  4218465  100.0%

year_of_publication
Real number (ℝ)

High correlation 

Distinct: 92
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1995.6753
Minimum: 1376
Maximum: 2005
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 1376
5-th percentile: 1982
Q1: 1993
median: 1997
Q3: 2001
95-th percentile: 2003
Maximum: 2005
Range: 629
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 7.4128886
Coefficient of variation (CV): 0.0037144764
Kurtosis: 324.33089
Mean: 1995.6753
Median Absolute Deviation (MAD): 4
Skewness: -5.7607306
Sum: 6.1226319 × 10^8
Variance: 54.950917
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
2002               30311  9.9%
2001               25818  8.4%
2003               23326  7.6%
1999               23255  7.6%
2000               22669  7.4%
1998               19734  6.4%
1994               17888  5.8%
1997               17475  5.7%
1996               17102  5.6%
1995               15249  5.0%
Other values (82)  93968  30.6%

Minimum 10 values

Value  Count  Frequency (%)
1376   1      < 0.1%
1378   1      < 0.1%
1900   1      < 0.1%
1901   4      < 0.1%
1902   2      < 0.1%
1904   1      < 0.1%
1906   1      < 0.1%
1908   3      < 0.1%
1911   5      < 0.1%
1920   30     < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
2005   42     < 0.1%
2004   8073   2.6%
2003   23326  7.6%
2002   30311  9.9%
2001   25818  8.4%
2000   22669  7.4%
1999   23255  7.6%
1998   19734  6.4%
1997   17475  5.7%
1996   17102  5.6%
Distinct: 10408
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB

Length

Max length: 121
Median length: 79
Mean length: 14.257582
Min length: 1

Characters and Unicode

Total characters: 4374155
Distinct characters: 111
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 5031
Unique (%): 1.6%

Sample

1st row: HarperFlamingo Canada
2nd row: HarperFlamingo Canada
3rd row: HarperFlamingo Canada
4th row: HarperFlamingo Canada
5th row: HarperFlamingo Canada
Value                Count   Frequency (%)
books                84265   12.8%
publishing           21832   3.3%
press                16971   2.6%
bantam               14284   2.2%
group                13748   2.1%
(empty)              12699   1.9%
penguin              10426   1.6%
ballantine           10266   1.6%
pocket               10239   1.6%
company              9832    1.5%
Other values (7896)  451303  68.8%

Most occurring characters

Value               Count    Frequency (%)
o                   391068   8.9%
(space)             349071   8.0%
e                   317590   7.3%
n                   283907   6.5%
r                   267263   6.1%
a                   267063   6.1%
s                   260192   5.9%
i                   239118   5.5%
l                   193585   4.4%
t                   174411   4.0%
Other values (101)  1630887  37.3%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  4374155  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  4374155  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  4374155  100.0%

img_url
Text

Distinct: 129777
Distinct (%): 42.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB

Length

Max length: 60
Median length: 60
Mean length: 60
Min length: 60

Characters and Unicode

Total characters: 18407700
Distinct characters: 53
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 88392
Unique (%): 28.8%

Sample

1st row: http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg
2nd row: http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg
3rd row: http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg
4th row: http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg
5th row: http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg
Value                                                         Count   Frequency (%)
http://images.amazon.com/images/p/0316666343.01.thumbzzz.jpg  566     0.2%
http://images.amazon.com/images/p/0971880107.01.thumbzzz.jpg  465     0.2%
http://images.amazon.com/images/p/0385504209.01.thumbzzz.jpg  390     0.1%
http://images.amazon.com/images/p/0312195516.01.thumbzzz.jpg  307     0.1%
http://images.amazon.com/images/p/0060928336.01.thumbzzz.jpg  256     0.1%
http://images.amazon.com/images/p/059035342x.01.thumbzzz.jpg  251     0.1%
http://images.amazon.com/images/p/0142001740.01.thumbzzz.jpg  246     0.1%
http://images.amazon.com/images/p/0446672211.01.thumbzzz.jpg  236     0.1%
http://images.amazon.com/images/p/044023722x.01.thumbzzz.jpg  225     0.1%
http://images.amazon.com/images/p/0452282152.01.thumbzzz.jpg  223     0.1%
Other values (129767)                                         303630  99.0%

Most occurring characters

Value              Count    Frequency (%)
/                  1533975  8.3%
.                  1533975  8.3%
m                  1227180  6.7%
a                  1227180  6.7%
Z                  920392   5.0%
g                  920385   5.0%
0                  877589   4.8%
1                  622552   3.4%
t                  613590   3.3%
o                  613590   3.3%
Other values (43)  8317292  45.2%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  18407700  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  18407700  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  18407700  100.0%

language
Categorical

Imbalance 

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB

Value              Count
en                 301366
de                 2226
es                 1486
fr                 1175
it                 296
Other values (19)  246

Length

Max length: 5
Median length: 2
Mean length: 2.0000391
Min length: 2

Characters and Unicode

Total characters: 613602
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 9
Unique (%): < 0.1%

Sample

1st row: en
2nd row: en
3rd row: en
4th row: en
5th row: en

Common Values

Value              Count   Frequency (%)
en                 301366  98.2%
de                 2226    0.7%
es                 1486    0.5%
fr                 1175    0.4%
it                 296     0.1%
nl                 81      < 0.1%
pt                 56      < 0.1%
da                 43      < 0.1%
ca                 23      < 0.1%
ms                 10      < 0.1%
Other values (14)  33      < 0.1%
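
The Imbalance alert quotes 96.4% for language, while "en" covers 98.2% of rows, so the two figures measure different things. One definition that reproduces 96.4% is one minus the Shannon entropy of the class counts, normalised by the logarithm of the number of classes; this is stated here as an assumption about how the score is computed. The sketch also has to invent how the 33 rows in "Other values (14)" are split, since the report gives only their total: nine values are taken to occur once (matching the Unique count of 9) and the remaining 24 rows are spread over five values.

```python
import math

# Language counts copied from the Common Values table.
top_counts = [301366, 2226, 1486, 1175, 296, 81, 56, 43, 23, 10]
# Assumption: illustrative split of the 33 rows in "Other values (14)".
other_counts = [1] * 9 + [5, 5, 5, 5, 4]
counts = top_counts + other_counts


def imbalance_score(class_counts: list) -> float:
    """One minus Shannon entropy, normalised by log2 of the number of classes."""
    if len(class_counts) < 2:
        return 0.0
    n = sum(class_counts)
    entropy = -sum(c / n * math.log2(c / n) for c in class_counts if c > 0)
    return 1 - entropy / math.log2(len(class_counts))


score = imbalance_score(counts)
share_en = top_counts[0] / sum(counts)
print(f"imbalance: {100 * score:.1f}%  share of 'en': {100 * share_en:.1f}%")
```

The exact split of the rare values barely moves the score, because those 33 rows carry very little of the total entropy.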

Length

Histogram of lengths of the category

Most occurring characters

Value              Count   Frequency (%)
e                  305080  49.7%
n                  301452  49.1%
d                  2269    0.4%
s                  1496    0.2%
r                  1186    0.2%
f                  1176    0.2%
t                  352     0.1%
i                  297     < 0.1%
l                  86      < 0.1%
a                  74      < 0.1%
Other values (16)  134     < 0.1%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  613602  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  613602  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  613602  100.0%

category
Text

Missing 

Distinct: 3723
Distinct (%): 2.0%
Missing: 121221
Missing (%): 39.5%
Memory size: 2.3 MiB

Length

Max length: 116
Median length: 9
Mean length: 11.962279
Min length: 3

Characters and Unicode

Total characters: 2219888
Distinct characters: 95
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1858
Unique (%): 1.0%

Sample

1st row: 'Actresses'
2nd row: 'Actresses'
3rd row: 'Actresses'
4th row: 'Actresses'
5th row: 'Actresses'
Value                Count   Frequency (%)
fiction              123141  48.4%
(empty)              15095   5.9%
juvenile             13766   5.4%
biography            7721    3.0%
autobiography        7697    3.0%
humor                3309    1.3%
science              3291    1.3%
history              2762    1.1%
religion             2407    0.9%
body                 1765    0.7%
Other values (3556)  73411   28.9%

Most occurring characters

Value              Count   Frequency (%)
'                  367076  16.5%
i                  336006  15.1%
o                  194748  8.8%
n                  176778  8.0%
t                  167164  7.5%
c                  152186  6.9%
F                  125946  5.7%
e                  86365   3.9%
(space)            68791   3.1%
r                  57207   2.6%
Other values (85)  487621  22.0%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2219888  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2219888  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2219888  100.0%

summary
Text

Missing 

Distinct70061
Distinct (%)37.3%
Missing119084
Missing (%)38.8%
Memory size2.3 MiB

Length

Max length374
Median length247
Mean length178.7346
Min length1

Characters and Unicode

Total characters33550451
Distinct characters374
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique44441
Unique (%)23.7%

Sample

1st rowIn a small town in Canada, Clara Callan reluctantly takes leave of her sister, Nora, who is bound for New York.
2nd rowIn a small town in Canada, Clara Callan reluctantly takes leave of her sister, Nora, who is bound for New York.
3rd rowIn a small town in Canada, Clara Callan reluctantly takes leave of her sister, Nora, who is bound for New York.
4th rowIn a small town in Canada, Clara Callan reluctantly takes leave of her sister, Nora, who is bound for New York.
5th rowIn a small town in Canada, Clara Callan reluctantly takes leave of her sister, Nora, who is bound for New York.
ValueCountFrequency (%)
the315278
 
5.8%
of218631
 
4.0%
a207181
 
3.8%
and192508
 
3.6%
to125891
 
2.3%
in104148
 
1.9%
her63811
 
1.2%
is58657
 
1.1%
his47179
 
0.9%
for45534
 
0.8%
Other values (106311)4039268
74.6%

Most occurring characters

ValueCountFrequency (%)
4834108
14.4%
e3199762
 
9.5%
a2140870
 
6.4%
t2126891
 
6.3%
o1968108
 
5.9%
i1966558
 
5.9%
n1961327
 
5.8%
r1826392
 
5.4%
s1774140
 
5.3%
h1279718
 
3.8%
Other values (364)10472577
31.2%

Most occurring categories

ValueCountFrequency (%)
(unknown)33550451
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
4834108
14.4%
e3199762
 
9.5%
a2140870
 
6.4%
t2126891
 
6.3%
o1968108
 
5.9%
i1966558
 
5.9%
n1961327
 
5.8%
r1826392
 
5.4%
s1774140
 
5.3%
h1279718
 
3.8%
Other values (364)10472577
31.2%

Most occurring scripts

ValueCountFrequency (%)
(unknown)33550451
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
4834108
14.4%
e3199762
 
9.5%
a2140870
 
6.4%
t2126891
 
6.3%
o1968108
 
5.9%
i1966558
 
5.9%
n1961327
 
5.8%
r1826392
 
5.4%
s1774140
 
5.3%
h1279718
 
3.8%
Other values (364)10472577
31.2%

Most occurring blocks

ValueCountFrequency (%)
(unknown)33550451
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
4834108
14.4%
e3199762
 
9.5%
a2140870
 
6.4%
t2126891
 
6.3%
o1968108
 
5.9%
i1966558
 
5.9%
n1961327
 
5.8%
r1826392
 
5.4%
s1774140
 
5.3%
h1279718
 
3.8%
Other values (364)10472577
31.2%

img_path
Text

Distinct129777
Distinct (%)42.3%
Missing0
Missing (%)0.0%
Memory size2.3 MiB

Length

Max length33
Median length33
Mean length33
Min length33

Characters and Unicode

Total characters10124235
Distinct characters46
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique88392
Unique (%)28.8%

Sample

1st rowimages/0002005018.01.THUMBZZZ.jpg
2nd rowimages/0002005018.01.THUMBZZZ.jpg
3rd rowimages/0002005018.01.THUMBZZZ.jpg
4th rowimages/0002005018.01.THUMBZZZ.jpg
5th rowimages/0002005018.01.THUMBZZZ.jpg
ValueCountFrequency (%)
images/0316666343.01.thumbzzz.jpg566
 
0.2%
images/0971880107.01.thumbzzz.jpg465
 
0.2%
images/0385504209.01.thumbzzz.jpg390
 
0.1%
images/0312195516.01.thumbzzz.jpg307
 
0.1%
images/0060928336.01.thumbzzz.jpg256
 
0.1%
images/059035342x.01.thumbzzz.jpg251
 
0.1%
images/0142001740.01.thumbzzz.jpg246
 
0.1%
images/0446672211.01.thumbzzz.jpg236
 
0.1%
images/044023722x.01.thumbzzz.jpg225
 
0.1%
images/0452282152.01.thumbzzz.jpg223
 
0.1%
Other values (129767)303630
99.0%

Most occurring characters

ValueCountFrequency (%)
Z920392
 
9.1%
.920385
 
9.1%
0877589
 
8.7%
1622552
 
6.1%
g613590
 
6.1%
4322070
 
3.2%
5306897
 
3.0%
B306852
 
3.0%
U306806
 
3.0%
M306804
 
3.0%
Other values (36)4620298
45.6%

Most occurring categories

ValueCountFrequency (%)
(unknown)10124235
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
Z920392
 
9.1%
.920385
 
9.1%
0877589
 
8.7%
1622552
 
6.1%
g613590
 
6.1%
4322070
 
3.2%
5306897
 
3.0%
B306852
 
3.0%
U306806
 
3.0%
M306804
 
3.0%
Other values (36)4620298
45.6%

Most occurring scripts

ValueCountFrequency (%)
(unknown)10124235
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
Z920392
 
9.1%
.920385
 
9.1%
0877589
 
8.7%
1622552
 
6.1%
g613590
 
6.1%
4322070
 
3.2%
5306897
 
3.0%
B306852
 
3.0%
U306806
 
3.0%
M306804
 
3.0%
Other values (36)4620298
45.6%

Most occurring blocks

ValueCountFrequency (%)
(unknown)10124235
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
Z920392
 
9.1%
.920385
 
9.1%
0877589
 
8.7%
1622552
 
6.1%
g613590
 
6.1%
4322070
 
3.2%
5306897
 
3.0%
B306852
 
3.0%
U306806
 
3.0%
M306804
 
3.0%
Other values (36)4620298
45.6%

publication_range
Real number (ℝ)

High correlation 

Distinct12
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1991.5698
Minimum1370
Maximum2000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size2.3 MiB

Quantile statistics

Minimum1370
5-th percentile1980
Q11990
median1990
Q32000
95-th percentile2000
Maximum2000
Range630
Interquartile range (IQR)10

Descriptive statistics

Standard deviation8.3420231
Coefficient of variation (CV)0.0041886673
Kurtosis205.42669
Mean1991.5698
Median Absolute Deviation (MAD)10
Skewness-4.0247758
Sum6.1100365 × 10^8
Variance69.58935
MonotonicityNot monotonic
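Several of the figures above are simple functions of the others, which makes them easy to sanity-check. The sketch below recomputes them from the rounded values printed in this report; the variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the publication_range statistics in this report.
std_dev = 8.3420231
mean = 1991.5698
minimum, maximum = 1370, 2000
q1, q3 = 1990, 2000

# Coefficient of variation is the standard deviation divided by the mean.
coefficient_of_variation = std_dev / mean  # report lists 0.0041886673

# Variance is the square of the standard deviation.
variance = std_dev ** 2  # report lists 69.58935

# Range and interquartile range are differences of order statistics.
value_range = maximum - minimum  # report lists 630
iqr = q3 - q1  # report lists 10
```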
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
1990 | 148678 | 48.5%
2000 | 110239 | 35.9%
1980 | 37825 | 12.3%
1970 | 7581 | 2.5%
1960 | 1421 | 0.5%
1950 | 799 | 0.3%
1940 | 104 | < 0.1%
1920 | 67 | < 0.1%
1930 | 62 | < 0.1%
1900 | 12 | < 0.1%
Other values (2) | 7 | < 0.1%

Value | Count | Frequency (%)
1370 | 2 | < 0.1%
1900 | 12 | < 0.1%
1910 | 5 | < 0.1%
1920 | 67 | < 0.1%
1930 | 62 | < 0.1%
1940 | 104 | < 0.1%
1950 | 799 | 0.3%
1960 | 1421 | 0.5%
1970 | 7581 | 2.5%
1980 | 37825 | 12.3%

Value | Count | Frequency (%)
2000 | 110239 | 35.9%
1990 | 148678 | 48.5%
1980 | 37825 | 12.3%
1970 | 7581 | 2.5%
1960 | 1421 | 0.5%
1950 | 799 | 0.3%
1940 | 104 | < 0.1%
1930 | 62 | < 0.1%
1920 | 67 | < 0.1%
1910 | 5 | < 0.1%
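The values of publication_range are all multiples of ten, and the sample rows in this report pair each year_of_publication with the start of its decade (for example 2001.0 with 2000.0 and 1999.0 with 1990.0). The sketch below shows one way such a column could be derived. The rule is inferred from those rows only, and the function name `decade_floor` is hypothetical, so treat this as an assumption rather than the dataset's documented preprocessing.

```python
def decade_floor(year):
    """Round a publication year down to the first year of its decade."""
    return year // 10 * 10


# (year_of_publication, publication_range) pairs taken from the sample rows.
observed_pairs = [
    (2001.0, 2000.0),
    (1991.0, 1990.0),
    (1999.0, 1990.0),
    (1985.0, 1980.0),
    (1996.0, 1990.0),
    (1981.0, 1980.0),
]

derived = [decade_floor(year) for year, _ in observed_pairs]
```

A derivation like this would also account for the high correlation flagged between the two columns, since one is a coarsened copy of the other.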

Interactions

[Figure: 36 pairwise interaction plots, one per pair of the 6 numeric variables; images not preserved]

Correlations

variable | age | age_range | language | publication_range | rating | user_id | year_of_publication
age | 1.000 | 0.952 | 0.017 | 0.033 | 0.041 | 0.001 | 0.039
age_range | 0.952 | 1.000 | 0.012 | 0.035 | 0.060 | 0.010 | 0.042
language | 0.017 | 0.012 | 1.000 | 0.500 | 0.010 | 0.010 | 0.500
publication_range | 0.033 | 0.035 | 0.500 | 1.000 | 0.003 | 0.002 | 0.917
rating | 0.041 | 0.060 | 0.010 | 0.003 | 1.000 | -0.013 | 0.007
user_id | 0.001 | 0.010 | 0.010 | 0.002 | -0.013 | 1.000 | 0.001
year_of_publication | 0.039 | 0.042 | 0.500 | 0.917 | 0.007 | 0.001 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
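To make the idea of nullity correlation concrete, the sketch below computes the Pearson correlation between the presence indicators of two columns in plain Python. The function name `nullity_correlation` and the toy columns are assumptions for illustration (missing entries are represented as `None`); the heatmap in the report may be computed differently in detail.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Each value is mapped to 1.0 if present and 0.0 if missing (None).
    Returns NaN when either column is entirely present or entirely missing,
    because the correlation is undefined without variation.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Toy columns in which the two fields are always missing together.
category = ["'Fiction'", None, None, "'Humor'", None]
summary = ["some text", None, None, "other text", None]
together = nullity_correlation(category, summary)

# Toy columns in which exactly one of the two fields is present per row.
left = ["x", None, "x", None]
right = [None, "y", None, "y"]
opposite = nullity_correlation(left, right)
```

A value near 1 means the two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.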

Sample

index | user_id | isbn | rating | location | age | location_country | location_state | location_city | age_range | book_title | book_author | year_of_publication | publisher | img_url | language | category | summary | img_path | publication_range
0 | 8 | 0002005018 | 4 | timmins, ontario, canada | 29.0 | canada | ontario | timmins | 20.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0
1 | 67544 | 0002005018 | 7 | toronto, ontario, canada | 30.0 | canada | ontario | toronto | 30.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0
2 | 123629 | 0002005018 | 8 | kingston, ontario, canada | 29.0 | canada | ontario | kingston | 20.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0
3 | 200273 | 0002005018 | 8 | comber, ontario, canada | 29.0 | canada | ontario | comber | 20.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0
4 | 210926 | 0002005018 | 9 | guelph, ontario, canada | 29.0 | canada | ontario | guelph | 20.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0
5 | 219008 | 0002005018 | 7 | halifax, nova scotia, canada | 60.0 | canada | nova scotia | halifax | 60.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0
6 | 263325 | 0002005018 | 5 | fredericton, new brunswick, canada | 27.0 | canada | new brunswick | fredericton | 20.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0
7 | 2954 | 0060973129 | 8 | wichita, kansas, usa | 71.0 | usa | kansas | wichita | 70.0 | Decision in Normandy | Carlo D'Este | 1991.0 | HarperPerennial | http://images.amazon.com/images/P/0060973129.01.THUMBZZZ.jpg | en | '1940-1949' | Here, for the first time in paperback, is an outstanding military\nhistory that offers a dramatic new perspective on the Allied campaign\nthat began with the invasion of the D-Day beaches of Normandy. Nationa\nadvertising in Military History. | images/0060973129.01.THUMBZZZ.jpg | 1990.0
8 | 35704 | 0374157065 | 6 | kansas city, missouri, usa | 53.0 | usa | missouri | kansas city | 50.0 | Flu: The Story of the Great Influenza Pandemic of 1918 and the Search for the Virus That Caused It | Gina Bari Kolata | 1999.0 | Farrar Straus Giroux | http://images.amazon.com/images/P/0374157065.01.THUMBZZZ.jpg | en | 'Medical' | Describes the great flu epidemic of 1918, an outbreak that killed some\nforty million people worldwide, and discusses the efforts of\nscientists and public health officials to understand and prevent\nanother lethal pandemic | images/0374157065.01.THUMBZZZ.jpg | 1990.0
9 | 110912 | 0374157065 | 10 | milpitas, california, usa | 36.0 | usa | california | milpitas | 30.0 | Flu: The Story of the Great Influenza Pandemic of 1918 and the Search for the Virus That Caused It | Gina Bari Kolata | 1999.0 | Farrar Straus Giroux | http://images.amazon.com/images/P/0374157065.01.THUMBZZZ.jpg | en | 'Medical' | Describes the great flu epidemic of 1918, an outbreak that killed some\nforty million people worldwide, and discusses the efforts of\nscientists and public health officials to understand and prevent\nanother lethal pandemic | images/0374157065.01.THUMBZZZ.jpg | 1990.0

index | user_id | isbn | rating | location | age | location_country | location_state | location_city | age_range | book_title | book_author | year_of_publication | publisher | img_url | language | category | summary | img_path | publication_range
306785 | 278637 | 202054296X | 7 | strasbourg, alsace, france | 29.0 | france | alsace | strasbourg | 20.0 | L'Envoi Des Anges | Micheal Connelly | 2002.0 | Distribooks Inc | http://images.amazon.com/images/P/202054296X.01.THUMBZZZ.jpg | en | NaN | NaN | images/202054296X.01.THUMBZZZ.jpg | 2000.0
306786 | 278648 | 0449225208 | 3 | las vegas, nevada, usa | 29.0 | usa | nevada | las vegas | 20.0 | The Christmas Spirit | Patricia Wynn | 1996.0 | Ivy Books | http://images.amazon.com/images/P/0449225208.01.THUMBZZZ.jpg | en | 'Fiction' | Taking human form as part of a wager, mischievous elf Trudy lures Sir\nMatthew Dunstone into her world of magic, unexpectedly falls in love\nwith him, and fears her deception will make him despise her. Original. | images/0449225208.01.THUMBZZZ.jpg | 1990.0
306787 | 278659 | 0345330293 | 10 | vancouver, washington, usa | 33.0 | usa | washington | vancouver | 30.0 | Town Like Alice | Nevil Shute | 1981.0 | Ballantine Books | http://images.amazon.com/images/P/0345330293.01.THUMBZZZ.jpg | en | NaN | NaN | images/0345330293.01.THUMBZZZ.jpg | 1980.0
306788 | 278713 | 0670528951 | 7 | albuquerque, new mexico, usa | 63.0 | usa | new mexico | albuquerque | 60.0 | Orson Welles | Barbara Leaming | 1985.0 | Penguin USA | http://images.amazon.com/images/P/0670528951.01.THUMBZZZ.jpg | en | 'Biography & Autobiography' | Based on two years of interviews and research, this biography portrays\nthe flamboyant American genius onstage, behind the camera, in love,\nand under the gun | images/0670528951.01.THUMBZZZ.jpg | 1980.0
306789 | 278843 | 0689818904 | 7 | pismo beach, california, usa | 28.0 | usa | california | pismo beach | 20.0 | My Grandmother's Journey | John Cech | 1998.0 | Aladdin | http://images.amazon.com/images/P/0689818904.01.THUMBZZZ.jpg | en | 'Juvenile Fiction' | A grandmother tells the story of her eventful life in early twentieth-\ncentury Europe and her arrival in the United States after World War\nII. | images/0689818904.01.THUMBZZZ.jpg | 1990.0
306790 | 278843 | 0743525493 | 7 | pismo beach, california, usa | 28.0 | usa | california | pismo beach | 20.0 | The Motley Fool's What To Do with Your Money Now : Ten Steps to Staying Up in a Down Market (Motley Fool) | David Gardner | 2002.0 | Simon & Schuster Audio | http://images.amazon.com/images/P/0743525493.01.THUMBZZZ.jpg | en | NaN | NaN | images/0743525493.01.THUMBZZZ.jpg | 2000.0
306791 | 278851 | 067161746X | 6 | dallas, texas, usa | 33.0 | usa | texas | dallas | 30.0 | The Bachelor Home Companion: A Practical Guide to Keeping House Like a Pig | P.J. O'Rourke | 1987.0 | Pocket Books | http://images.amazon.com/images/P/067161746X.01.THUMBZZZ.jpg | en | 'Humor' | A tongue-in-cheek survival guide for single people reveals the\nquintessential secrets of no-fuss housekeeping | images/067161746X.01.THUMBZZZ.jpg | 1980.0
306792 | 278851 | 0884159221 | 7 | dallas, texas, usa | 33.0 | usa | texas | dallas | 30.0 | Why stop?: A guide to Texas historical roadside markers | Claude Dooley | 1985.0 | Lone Star Books | http://images.amazon.com/images/P/0884159221.01.THUMBZZZ.jpg | en | NaN | NaN | images/0884159221.01.THUMBZZZ.jpg | 1980.0
306793 | 278851 | 0912333022 | 7 | dallas, texas, usa | 33.0 | usa | texas | dallas | 30.0 | The Are You Being Served? Stories: 'Camping In' and Other Fiascoes | Jeremy Lloyd | 1997.0 | Kqed Books | http://images.amazon.com/images/P/0912333022.01.THUMBZZZ.jpg | en | 'Fiction' | These hilarious stories by the creator of public television's\nlongest-running hit series capture the wacky sensibility and off-the-\nwall humor of the British sitcom. | images/0912333022.01.THUMBZZZ.jpg | 1990.0
306794 | 278851 | 1569661057 | 10 | dallas, texas, usa | 33.0 | usa | texas | dallas | 30.0 | Dallas Street Map Guide and Directory, 2000 Edition | Mapsco | 1999.0 | American Map Corporation | http://images.amazon.com/images/P/1569661057.01.THUMBZZZ.jpg | en | NaN | NaN | images/1569661057.01.THUMBZZZ.jpg | 1990.0